v6.0 Optimization/Performance Highlights
This page highlights the notable additions to the v6.0 DT
centering on optimization issues and performance tuning.
- Optimization for Real-Time Graphics Applications, 2/96,
examines
- typical application requirements for graphics workstations
- multi-processing issues for graphics subsystems
- graphics workstation pipelines and performance trade-offs
- strategies for diagnosing pipeline bottlenecks
- database structure for traversal
- designing and tuning a real-time application
- run-time diagnostics and load-management strategies
- tools for debugging graphics performance
- MIPSpro(TM) Compiling and Performance Tuning Guide, 3/96,
discusses a variety of issues and tools involved in programming under the
IRIX operating system. It describes the components of the MIPSpro compiler
system, other programming tools and interfaces, and dynamic shared
objects. It also explains ways to improve program performance.
- Caching and Locality, 11/94,
discusses issues surrounding computer memory.
- Kernel Processes in IRIX 5.3 and IRIX 6.1, 3/96,
provides a brief description of some of the most commonly asked-about
IRIX kernel processes in IRIX 5.3 and IRIX 6.1: shaked, bdflush,
vfs_sync, pdflush, bpqueue, and xfsd. None of them has a man page, and
all of them deal with freeing up dynamically allocated kernel memory.
- IRIX Dynamically Loadable Kernel Modules, 3/95,
focuses on what dynamically loadable kernel modules are, which versions of
IRIX support them, which IRIX modules are dynamically loadable, when it is
appropriate to use them, performance-related issues, how to tell whether a
dynamically loadable kernel module functions properly, and how to debug
one.
- Kernel Tuning in IRIX 5.x and IRIX 6.0.x, 7/95,
presents a general discussion on tuning IRIX kernels as well as the
preparation for, and recovery from, an unbootable kernel.
- REACT(TM) Real-Time Programmer's Guide, 3/96.
A real-time program is one that must maintain a fixed timing relationship
to external hardware. In order to respond to the hardware quickly and
reliably, a real-time program must have special support from the system
software and hardware. This guide describes the support that IRIX and the
Silicon Graphics CHALLENGE, Onyx, and POWERCHALLENGE computers provide to
real-time programs. The support bundled with all versions of IRIX is
called REACT. A set of extra-cost features is called REACT/Pro. This
guide covers REACT for IRIX 6.2, and REACT/Pro 3.0.
- Is your X code ready for 64-bit?, 9/95,
focuses on how the coming transition to 64-bit in the workstation world
affects X programmers coding in C or C++. The transition to 64-bit will
be much easier if you are aware of how 64-bit systems affect the X
Window System and the X code you write.
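One hazard that commonly affects C code in a 32-bit to 64-bit transition can be sketched as follows. This is an illustration only, not a summary of the article. It assumes a data model in which pointers may be wider than int, and the names client_data_t, pack_pointer, unpack_pointer, and pointer_wider_than_int are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "client data" slot of the kind passed to callbacks.
 * Using a pointer-sized integer type keeps it safe whether pointers
 * are 32 or 64 bits wide. */
typedef intptr_t client_data_t;

/* Store a pointer in the slot without losing high-order bits. */
client_data_t pack_pointer(void *p)
{
    assert(sizeof(client_data_t) >= sizeof(void *));
    return (client_data_t)p;
}

/* Recover the pointer from the slot. */
void *unpack_pointer(client_data_t d)
{
    return (void *)d;
}

/* Returns 1 when pointers are wider than int, which is the case in
 * which storing a pointer in an int would truncate it. */
int pointer_wider_than_int(void)
{
    return sizeof(void *) > sizeof(int);
}
```

Code that instead casts a pointer to int and back may appear to work on a 32-bit system and fail once pointers grow to 64 bits.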
- GLR, an OpenGL render server facility, for IRIX 6.2 or above
GLR is a network-extensible render service providing OpenGL rendering
into rectangular frame buffer regions. GLR virtualizes access to fast,
high-quality, but expensive rendering hardware for purposes such as
image processing, printing, frame buffer calculations, and off-loaded
high-quality rendering.
- OpenGL Render Serving with GLR, for IRIX 6.2 or above
GLR is an OpenGL-based render facility. It provides a mechanism to
share expensive graphics hardware resources between users on a
network. The idea is to amortize the hardware among multiple users to
bring high-quality rendering to a larger application and user
audience. Imagine a network of Indy workstations that use their local
graphics acceleration for most interactive tasks, but can fall back to
GLR on a RealityEngine or InfiniteReality for extremely high-quality
rendering.
- Fast VisSim On Impact Graphics, 4/96
This article describes work done on an example program implementing
geo-specific terrain-following, courtesy of Patrick Bouchaud
(Cortaillod), with program enhancements by the author.
Optimizing VISSIM applications for the Impact graphics platform is an
exciting challenge, similar to the challenge of developing mid-range
graphics: to provide the fullest range of features and the highest
performance possible within severe constraints of cost and space. With
Impact graphics texture-mapping, a fast and powerful system is available
to application developers, but special care must be taken to effectively
utilize this speed and power.
The task of providing platform-specific optimizations for VISSIM is best
left to libraries like IRIS Performer. In fact, work is currently
under way to incorporate this information into Performer. This article
is meant to be generally informative, especially for those who do write
in OpenGL.
- IMPACT Configurations, (from Impact Technical Report, Chapter 4, IMPACT Graphics Subsystem)
- IRIX Dynamically Loadable Kernel Modules, 3/95
IRIX dynamically loadable kernel modules were introduced with the
release of IRIX 5.0. The steps needed to create and load dynamically
loadable kernel modules are documented in the mload(4), ml(1M), and
lboot(1M) manual pages and in the IRIX Device Driver Programming Guide
for IRIX 5.0 and later releases. This article focuses only on the
following issues related to dynamically loadable kernel modules:
- What are dynamically loadable kernel modules
- Which versions of IRIX support dynamically loadable kernel modules
- Which IRIX modules are dynamically loadable kernel modules
- When is it appropriate to use dynamically loadable kernel modules
- Performance-related issues
- How to tell if a dynamically loadable kernel module functions properly
- Debugging a dynamically loadable kernel module
The term module refers to any of the following in this article: a
character or block device driver, streams device driver, streams
module, library module, and kernel debug module (kernel debug modules
are only supported in IRIX 5.2 and above). Modules are either
provided with IRIX or can be developed by users.
- Scalability in the XFS File System, 1/96
This paper by Adam Sweeney was presented at the January, 1996 USENIX
conference in San Diego, California. Information on future plans and
performance data in this paper are not to be construed as commitments
by SGI.
- Controlling a Program's Layout with ELSPEC, 11/95
describes how programmers can use the ELF Layout Specification language,
ELSPEC, to override the default layout of text and data sections within
a program.
- SGI Desktop Audio Hardware Performance Specifications (Typical), Indigo², Indy, 3/93
- Runtime Issues
(Chapter 5, from MIPSpro(TM) 64-Bit Porting and Transition Guide, 3/96):
This chapter outlines why your 32-bit and 64-bit applications may run
differently, due both to compiler differences and to architectural
modes. It describes the Performance and Precise Exception Modes of the
R8000 microprocessor architecture and how they affect the calculations
of applications. This chapter also briefly outlines a methodology to
bring up and debug applications.
- IRIX Kernel Tunable Parameters,
(Appendix A, from IRIX Admin: System Configuration and Operation, 3/96):
This appendix describes the tunable parameters that define kernel structures. These structures keep
track of processes, files, and system activity. Many of the parameter values are specified in the files
found in /var/sysgen/mtune and /var/sysgen/master.d.
- Performance,
(Chapter 1, from R10000 Microprocessor User's Manual-Version 1.1, 1/96):
As it executes programs, the R10000 superscalar processor performs many operations in parallel.
Instructions can also be executed out of order. Together, these two facts greatly improve
performance, but they also make it difficult to predict the time required to execute any section of a
program, since it often depends on the instruction mix and the critical dependencies between
instructions.
The processor has five largely independent execution units, each of which is individualized for a
specific class of instructions. Any one of these units may limit processor performance, even as the
other units sit idle. If this occurs, instructions which use the idle units can be added to the program
without adding any appreciable delay.
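The effect of critical dependencies between instructions can be illustrated with a small sketch. It is not taken from the manual; the function names are hypothetical, and whether the second form actually runs faster depends on the processor and compiler.

```c
#include <assert.h>
#include <stddef.h>

/* One accumulator: each add depends on the result of the previous
 * add, so the adds form a single long dependency chain. */
double sum_one_chain(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four accumulators: four independent chains that an out-of-order,
 * superscalar processor is free to overlap. */
double sum_four_chains(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)      /* leftover elements */
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}
```

Splitting the sum changes the order of the floating-point additions, so the two functions can differ in the last bits of the result for general data.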
- Performance Tuning for the R8000
(Chapter 6 from MIPSpro(TM) 64-Bit Porting and Transition Guide, 3/96):
This chapter outlines techniques for tuning the performance of your R8000 applications. It contains
four sections:
- The first section presents the compiler optimization technique of software pipelining, which
is crucial to getting optimal performance on the R8000. It shows you how to read your
software pipelined code and how to understand what it does.
- The second section uses matrix multiplies as a case study on loop unrolling.
- The third section describes the phenomenon of bellows stalls on the R8000 architecture and
gives tips on how to avoid them.
- The final section describes how the IVDEP directive can be used in Fortran to gain
performance.
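As a rough illustration of loop unrolling applied to a matrix multiply, the sketch below unrolls the j loop by two and fuses the copies, so that each loaded element of a feeds two results. It is a generic example written for this page, not the code from the guide's case study; the function names and the row-major layout are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: c = a * b, all n x n, stored row-major. */
void matmul_plain(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double s = 0.0;
            for (size_t k = 0; k < n; k++)
                s += a[i * n + k] * b[k * n + j];
            c[i * n + j] = s;
        }
}

/* Same product with the j loop unrolled by two. Each a[i][k] is
 * loaded once and used for two neighboring columns of c. */
void matmul_unrolled(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        for (; j + 2 <= n; j += 2) {
            double s0 = 0.0, s1 = 0.0;
            for (size_t k = 0; k < n; k++) {
                double aik = a[i * n + k];
                s0 += aik * b[k * n + j];
                s1 += aik * b[k * n + j + 1];
            }
            c[i * n + j] = s0;
            c[i * n + j + 1] = s1;
        }
        for (; j < n; j++) {    /* cleanup column when n is odd */
            double s = 0.0;
            for (size_t k = 0; k < n; k++)
                s += a[i * n + k] * b[k * n + j];
            c[i * n + j] = s;
        }
    }
}
```

Each element of c is still accumulated in the same order as in the reference version, so both functions produce identical results. An optimizing compiler may perform a transformation like this on its own.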
- Performance Tuning Tools
(Chapter 5, from Programming on Silicon Graphics Systems: An Overview, 3/96):
IRIX provides several tools that you can use to optimize the performance of your application. Table
5-2 [prof, pixie, par, cord, xscope] summarizes these tools and the paragraphs following
the table describe each tool in greater detail.
- System Administration for Guaranteed-Rate I/O
(Chapter 9, from IRIX Admin: Disks and Filesystems, 3/96):
Guaranteed-rate I/O, or GRIO for short, is a mechanism that enables a user
application to reserve part of a system's I/O resources for its exclusive use. For
example, it can be used to enable "real-time" retrieval and storage of data
streams. GRIO manages the system resources among competing applications, so
the actions of new processes do not affect the performance of existing ones.
GRIO can read and write only files on a real-time subvolume of an XFS
filesystem. To use GRIO, the subsystem eoe.sw.xfsrt must be installed.
This chapter explains important guaranteed-rate I/O concepts, describes how to
configure a system for GRIO, and provides instructions for creating an XLV
logical volume for use with applications that use GRIO.
The major sections in this chapter are:
- "Guaranteed-Rate I/O Overview"
- "GRIO Guarantee Types"
- "GRIO System Components"
- "Hardware Configuration Requirements for GRIO"
- "Configuring a System for GRIO"
- "Additional Procedures for GRIO"
- "GRIO File Formats"
- System Performance Tuning
(Chapter 11, from IRIX Admin: System Configuration and Operation, 3/96):
This chapter describes the basics of tuning the IRIX operating system for the
best possible performance for your particular needs. Information provided
includes the following topics:
- General information on system tuning and kernel parameters. See "Theory of System Performance Tuning".
- Tuning applications under development. See "Application Tuning".
- Observing the operating system to determine if it should be tuned. See "Monitoring the Operating System".
- Tuning and reconfiguring the operating system. See "Tuning The Operating System".
- System-Specific Tuning
(Chapter 14, from OpenGL on Silicon Graphics Systems, 3/96):
This chapter provides tuning information that's relevant for particular Silicon
Graphics systems. Use these techniques as needed if you expect your program to
be used primarily on one kind of system, or a group of systems. The chapter
discusses:
- "Optimizing Performance on Low-End Graphics Systems"
- "Optimizing Performance on Mid-Range Systems"
- "Optimizing Performance on Indigo2 IMPACT Systems"
- "Optimizing Performance on RealityEngine Systems"
Some points are also discussed in earlier chapters but repeated here because they
result in particularly noticeable performance improvement on certain platforms.
- Tuning Graphics Applications: Examples
(Chapter 13, from OpenGL on Silicon Graphics Systems, 3/96):
This chapter first presents a code fragment that helps you draw pixels fast. The
second section steps through an example of tuning a small graphics program,
showing changes to the program and discussing the speed improvements that
result.
- Tuning Graphics Applications: Fundamentals
(Chapter 11, from OpenGL on Silicon Graphics Systems, 3/96):
Tuning your software makes it use hardware capabilities more effectively. This
chapter looks at tuning graphics applications. It discusses pipeline tuning as a
conceptual framework for tuning graphics applications, and introduces some
other fundamentals of tuning:
- "Why Is Tuning Useful?"
- "What Is Pipeline Tuning?"
- "Tuning Animation"
- "Optimizing Cache and Memory Use"
- "Taking Timing Measurements"
Writing high-performance code is usually more complex than just following a
set of rules. Most often, it involves making trade-offs between special
functions, quality, and performance for a particular application.
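A minimal sketch of taking a timing measurement is shown below. It assumes a POSIX system that provides clock_gettime with CLOCK_MONOTONIC and is not the method described in the chapter; now_seconds, seconds_per_call, and sample_work are hypothetical names, and sample_work is only a stand-in for the code being measured.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Current time on a monotonic clock, in seconds. */
double now_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Average time per call of work(), measured over reps calls so that
 * the cost of reading the clock is spread over many iterations. */
double seconds_per_call(void (*work)(void), int reps)
{
    assert(work != NULL && reps > 0);
    double t0 = now_seconds();
    for (int i = 0; i < reps; i++)
        work();
    double t1 = now_seconds();
    return (t1 - t0) / reps;
}

/* Hypothetical stand-in for the code under test. */
static volatile unsigned long sink;
void sample_work(void)
{
    for (unsigned long i = 0; i < 1000; i++)
        sink += i;
}
```

When the work being timed issues OpenGL commands, the commands may still be in flight when the call returns, so a measurement of this kind normally calls glFinish before each clock reading.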
- Tuning the Pipeline
(Chapter 12, from OpenGL on Silicon Graphics Systems, 3/96):
This chapter looks in some detail at tuning the graphics pipeline. It presents a
variety of techniques for optimizing the different parts of the pipeline,
providing code fragments and examples as appropriate. You learn about:
- "CPU Tuning: Basics"
- "CPU Tuning: Immediate Mode Drawing"
- "CPU Tuning: Display Lists"
- "CPU Tuning: Advanced Techniques"
- "Tuning the Geometry Subsystem"
- "Tuning the Raster Subsystem"
- "Tuning the Imaging Pipeline"
- Benchmarking Libraries: libpdb and libisfast
(Appendix C, from OpenGL on Silicon Graphics Systems, 3/96):
- "Overview"
- "Libraries for Benchmarking"
- "Using libpdb"
- "Using libisfast"
- Models of Parallel Computation,
(Chapter 3, from Topics in IRIX Programming, 3/96):
Silicon Graphics, Inc., makes multiprocessor computer systems. You can use
any of several programming models to exploit the parallel capabilities of the
hardware. This chapter reviews the parallel programming models, supplying
enough information that you can select one model. Pointers to more detailed
documentation of each model are included. The major topics are:
- "Parallel Hardware and Programming Models" provides a quick survey of
the programming models and their relationship to the hardware.
- "Using Statement-Level Parallelism" discusses using fine-grained
parallel execution in Fortran and C.
- "Using Process-Level Parallelism" provides an overview of the use of
coordinated UNIX processes for parallel execution.
- "Using MPI and PVM" compares these two interfaces.
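The idea of coordinated UNIX processes can be sketched with portable POSIX calls. The example below uses only fork, pipe, and waitpid; it does not show any IRIX-specific interface that the chapter may cover, and the names sum_range and parallel_sum are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sum a[lo..hi) in the calling process. */
static long sum_range(const long *a, size_t lo, size_t hi)
{
    long s = 0;
    for (size_t i = lo; i < hi; i++)
        s += a[i];
    return s;
}

/* Sum an array with two processes. The child sums the upper half and
 * sends its partial result through a pipe; the parent sums the lower
 * half and combines the two. Returns 0 on success, -1 on failure. */
int parallel_sum(const long *a, size_t n, long *out)
{
    assert(a != NULL && out != NULL);
    int fd[2];
    if (pipe(fd) != 0)
        return -1;

    size_t mid = n / 2;
    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                     /* child process */
        close(fd[0]);
        long part = sum_range(a, mid, n);
        ssize_t w = write(fd[1], &part, sizeof part);
        _exit(w == (ssize_t)sizeof part ? 0 : 1);
    }

    close(fd[1]);                       /* parent process */
    long lower = sum_range(a, 0, mid);
    long upper = 0;
    ssize_t r = read(fd[0], &upper, sizeof upper);
    close(fd[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (r != (ssize_t)sizeof upper ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    *out = lower + upper;
    return 0;
}
```

Process creation and pipe traffic cost far more than the additions in this small example, so a split like this pays off only when each process has a substantial amount of work.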
- Multiprocessing Support
(Chapter 1, from MIPSpro(TM) 64-Bit Porting and Transition Guide, 3/96):
IRIX 6.2 and the MIPSpro compilers support multiprocessing primitives for
both 32-bit and 64-bit applications. The 64-bit multiprocessor programming
environment is a superset of the 32-bit one. It also contains enhancements.
- MP Compatibility
- MP Enhancements
- MP Application Testing
- KAP's Optimization Flags (-o, -r, -so)
(Chapter 1, from MIPSpro(TM) 64-Bit Porting and Transition Guide, 3/96):
KAP accepts three important optimization flags, -o, -r, and -so. They are
described in this section.
- Optimization Levels Passed to KAP
(Chapter 1, from MIPSpro(TM) 64-Bit Porting and Transition Guide, 3/96):
Three optimization flags are passed to KAP; their values depend on the
optimization level specified, as described below.
- Optimization Switches of the 64-Bit Compilers
(Chapter 4, from MIPSpro(TM) 64-Bit Porting and Transition Guide, 3/96):
In addition to the switches listed above, both the 64-bit Fortran and 64-bit C
compilers support many more switches. These switches are used to control the
types of optimizations the compilers perform. This section outlines the various
optimizations that the 64-bit compilers can perform, and lists the switches used
to control them.
Copyright © 1995-96,
Silicon Graphics, Inc.